AITopics | normalized score

Collaborating Authors

normalized score

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Pretraining a Shared Q-Network for Data-Efficient Offline Reinforcement Learning

Neural Information Processing SystemsJun-22-2026, 12:06:04 GMT

Offline reinforcement learning (RL) aims to learn a policy from a fixed dataset without additional environment interaction. However, effective offline policy learning often requires a large and diverse dataset to mitigate epistemic uncertainty. Collecting such data demands substantial online interactions, which are costly or infeasible in many real-world domains. Therefore, improving policy learning from limited offline data--achieving high data efficiency--is critical for practical offline RL. In this paper, we propose a simple yet effective plug-and-play pretraining framework that initializes the feature representation of a Q-network to enhance data efficiency in offline RL. Our approach employs a shared Q-network architecture trained in two stages: pretraining a backbone feature extractor with a transition prediction head; training a Q-network--combining the backbone feature extractor and a Q-value head--with any offline RL objective. Extensive experiments on the D4RL, Robomimic, V-D4RL, and ExoRL benchmarks show that our method substantially improves both performance and data efficiency across diverse datasets and domains. Remarkably, with only 10% of the dataset, our approach outperforms standard offline RL baselines trained on the full data.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Transfer in reinforcement learning aims at solving a new target task with no additional learning or sample-efficiently by exploiting agents and information obtained from source tasks. We review a line of research with relevant approaches. This group of approaches reuses policies learned on source tasks for target tasks. Fernández and Veloso [17] suggest an exploration strategy for the learning of a new policy given a new task and learned source policies, where the gain of using each policy is estimated together on-line and one of the policies in the set is selected probabilistically at each step, based on the gain, but they focus on aiding the training of the target policy with samples from the target task rather than improving the zero-shot transfer performance. On the other hand, Dayan [14] introduce successor representations (SRs), state space occupancy representations disentangled from rewards, which allow linear decomposition of value functions.

large language model, machine learning, target task, (21 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.36)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.30)

Add feedback

f58c24798220ba724fe05c0fa786227d-Paper-Conference.pdf

Neural Information Processing SystemsFeb-18-2026, 00:02:18 GMT

machine learning, natural language, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Spain > Basque Country > Biscay Province > Bilbao (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

a5f596699d8d4637532f955c7f2860f4-Supplemental-Datasets_and_Benchmarks.pdf

Neural Information Processing SystemsFeb-16-2026, 08:16:06 GMT

algorithm, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.60)

Add feedback

9381fc93ad66f9ec4b2eef71147a6665-Supplemental.pdf

Neural Information Processing SystemsFeb-9-2026, 09:16:30 GMT

architecture, information, trajectory, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada (0.04)

Industry:

Leisure & Entertainment > Games (0.68)
Leisure & Entertainment > Sports (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

9381fc93ad66f9ec4b2eef71147a6665-Paper.pdf

Neural Information Processing SystemsFeb-9-2026, 09:16:22 GMT

information, learning, value prediction, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada (0.04)

Industry: Leisure & Entertainment > Games (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Revisiting the Minimalist Approach to Offline Reinforcement Learning

Neural Information Processing SystemsFeb-9-2026, 03:22:05 GMT

However, the effect of these design choices on established baselines remains understudied.

arxiv preprint arxiv, machine learning, reinforcement learning, (12 more...)

Neural Information Processing Systems

Country: North America > United States > Montana (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

This group of approaches reuses policies learned on source tasks for target tasks. There is a series of studies that directly exploits the smoothness ofoptimal valuesacross taskswithfunction approximators. Figure 9: The performance profiles [2, 15] of the inference with GPI and constrained GPI on Reacher. For its use in the zero-shot transfer problem, we first set four fixed goal locations at (0.1,0.0),(0.0,0.1),( Our first observation is that while the transferred agents perform comparably on some tasks, constrained GPI makes significant differences on the others, especially more on the "Harsh" target tasks with many 1's as elements in their task vectors.

approximator, artificial intelligence, reacher, (18 more...)

Neural Information Processing Systems

Genre: Research Report (0.36)

Technology: Information Technology > Artificial Intelligence (1.00)

Add feedback

Learning Massively Multitask World Models for Continuous Control

Hansen, Nicklas, Su, Hao, Wang, Xiaolong

arXiv.org Artificial IntelligenceDec-3-2025

General-purpose control demands agents that act across many tasks and embodiments, yet research on reinforcement learning (RL) for continuous control remains dominated by single-task or offline regimes, reinforcing a view that online RL does not scale. Inspired by the foundation model recipe (large-scale pretraining followed by light RL) we ask whether a single agent can be trained on hundreds of tasks with online interaction. To accelerate research in this direction, we introduce a new benchmark with 200 diverse tasks spanning many domains and embodiments, each with language instructions, demonstrations, and optionally image observations. We then present \emph{Newt}, a language-conditioned multitask world model that is first pretrained on demonstrations to acquire task-aware representations and action priors, and then jointly optimized with online interaction across all tasks. Experiments show that Newt yields better multitask performance and data-efficiency than a set of strong baselines, exhibits strong open-loop control, and enables rapid adaptation to unseen tasks. We release our environments, demonstrations, code for training and evaluation, as well as 200+ checkpoints.

demonstration, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2511.19584

Country:

North America > United States (0.28)
Europe (0.28)

Genre: Research Report > New Finding (0.67)

Industry: Leisure & Entertainment > Games > Computer Games (0.93)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Add feedback

Evaluating Large Language Models on the 2026 Korean CSAT Mathematics Exam: Measuring Mathematical Ability in a Zero-Data-Leakage Setting

Pyeon, Goun, Heo, Inbum, Jung, Jeesu, Hwang, Taewook, Namgoong, Hyuk, Seo, Hyein, Han, Yerim, Kim, Eunbin, Kang, Hyeonseok, Jung, Sangkeun

arXiv.org Artificial IntelligenceDec-2-2025

This study systematically evaluated the mathematical reasoning capabilities of Large Language Models (LLMs) using the 2026 Korean College Scholastic Ability Test (CSAT) Mathematics section, ensuring a completely contamination-free evaluation environment. To address data leakage issues in existing benchmarks, we digitized all 46 questions (22 common and 24 elective) within two hours of the exam's public release, eliminating any possibility of inclusion in model training data. We conducted comprehensive evaluations of 24 state-of-the-art LLMs across varying input modalities (Text-only, Image-only, Text+Figure) and prompt languages (Korean, English). The GPT-5 family models achieved perfect scores (100 points) under a limited set of language-modality configurations, while Grok 4, Qwen 3 235B, and Gemini 2.5 pro also scored above 97 points. Notably, gpt-oss-20B achieved 95.7 points despite its relatively small size, demonstrating high cost-effectiveness. Problem-specific analysis revealed Calculus as the weakest domain with significant performance degradation on 4-point high-difficulty problems. Text input consistently outperformed image input, while prompt language effects varied by model scale. In reasoning enhancement experiments with GPT-5 series, increased reasoning intensity improved performance (82.6->100 points) but quadrupled token usage and drastically reduced efficiency, suggesting that models with minimal reasoning may be more practical. This research contributes: (1) implementation of a completely unexposed evaluation environment, (2) a standardized digitization pipeline that converts human-targeted exam materials into LLM-ready evaluation data, and (3) a practical evaluation perspective integrating performance, cost, and time considerations. Detailed results and model comparisons are available at the 2026 Korean CSAT LLM Evaluation Leaderboard; https://isoft.cnu.ac.kr/csat2026/

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2511.18649

Genre: